NSF PAR Search | NSF Public Access Repository


Physics-informed machine learning for automatic model reduction in chemical reaction networks

https://doi.org/10.1038/s41598-025-92680-8

Pateras, Joseph; Zhang, Colin; Majumdar, Shriya; Pal, Ayush; Ghosh, Preetam (December 2025, Scientific Reports)

Abstract Physics-informed machine learning bridges the gap between the high fidelity of mechanistic models and the adaptive insights of artificial intelligence. In chemical reaction network modeling, this synergy proves valuable, addressing the high computational costs of detailed mechanistic models while leveraging the predictive power of machine learning. This study applies this fusion to the biomedical challenge of Aβ fibril aggregation, a key factor in Alzheimer’s disease. Central to the research is the introduction of an automatic reaction order model reduction framework, designed to optimize reduced-order kinetic models. This framework represents a shift in model construction, automatically determining the appropriate level of detail for reaction network modeling. The proposed approach significantly improves simulation efficiency and accuracy, particularly in systems like Aβ aggregation, where precise modeling of nucleation and growth kinetics can reveal potential therapeutic targets. Additionally, the automatic model reduction technique has the potential to generalize to other network models. The methodology offers a scalable and adaptable tool for applications beyond biomedical research. Its ability to dynamically adjust model complexity based on system-specific needs ensures that models remain both computationally feasible and scientifically relevant, accommodating new data and evolving understandings of complex phenomena.
Full Text Available
A survey on deep learning for drug-target binding prediction: models, benchmarks, evaluation, and case studies

https://doi.org/10.1093/bib/bbaf491

Debnath, Kusal; Rana, Pratip; Ghosh, Preetam (August 2025, Briefings in Bioinformatics)

Abstract Conventional drug discovery is expensive, time-consuming, and prone to failure. Artificial intelligence has become a potent substitute over the last decade, providing strong answers to challenging biological issues in this field. Among these difficulties, drug-target binding (DTB) is a key component of drug discovery techniques. In this context, drug-target affinity and drug–target interaction are complementary and essential frameworks that work together to improve our comprehension of DTB dynamics. In this work, we thoroughly analyze the most recent deep learning models, popular benchmark datasets, and assessment metrics for DTB prediction. We look at the paradigm shift in the development of drug discovery research since researchers started using deep learning as a potent tool for DTB prediction. In particular, we examine how methodologies have evolved, starting with early heterogeneous network-based approaches, progressing to graph-based approaches that were widely accepted, followed by modern attention-based architectures, and finally, the most recent multimodal approaches. We also provide case studies utilizing an extensive compound library against specific protein targets implicated in critical cancer pathways to demonstrate the usefulness of these approaches. In addition to summarizing the latest developments in DTB prediction models, this review also identifies their drawbacks. It also highlights the outlook for the DTB prediction domain and future research directions. Combined, these studies present a more comprehensive view of how deep learning offers a quantitative framework for researching drug-target relationships, speeding up the identification of new drug candidates and making it easier to identify possible DTBs.
Full Text Available
Network and modeling analysis of MAPK signaling cascade uncovers EGR1 regulation through ERK2 protein in breast cancer

https://doi.org/10.1016/j.compbiomed.2025.110830

Pavithran, Honey; Ghosh, Preetam; Kumavath, Ranjith (July 2025, Computers in biology and medicine)

Full Text Available
Advancing infection profiling under data uncertainty through contagion potential

https://doi.org/10.1371/journal.pone.0329828

Roy, Satyaki; Biswas, Preetom; Ghosh, Preetam (August 2025, PLOS One)
Arunachalam, Viswanathan (Ed.)
During the COVID-19 pandemic, the prevalence of asymptomatic cases challenged the reliability of epidemiological statistics in policymaking. To address this, we introduced contagion potential (CP) as a continuous metric derived from sociodemographic and epidemiological data to quantify the infection risk posed by the asymptomatic within a region. However, CP estimation is hindered by incomplete or biased incidence data, where underreporting and testing constraints make direct estimation infeasible. To overcome this limitation, we employ a hypothesis-testing approach to infer CP from sampled data, allowing for robust estimation despite missing information. Even within the sample collected from spatial contact data, individuals possess partial knowledge of their neighborhoods, as their awareness is restricted to interactions captured by available tracking data. We introduce an adjustment factor that calibrates the sample CPs so that the sample is a reasonable estimate of the population CP. Further complicating estimation, biases in epidemiological and mobility data arise from heterogeneous reporting rates and sampling inconsistencies, which we address through inverse probability weighting to enhance reliability. Using a spatial model for infection spread through social mixing and an optimization framework based on the SIRS epidemic model, we analyze real infection datasets from Italy, Germany, and Austria. Our findings demonstrate that statistical methods can achieve high-confidence CP estimates while accounting for variations in sample size, confidence level, mobility models, and viral strains. By assessing the effects of bias, social mixing, and sampling frequency, we propose statistical corrections to improve CP prediction accuracy. Finally, we discuss how reliable CP estimates can inform outbreak mitigation strategies despite the inherent uncertainties in epidemiological data.
Full Text Available
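The contagion potential record above mentions inverse probability weighting as its correction for uneven reporting and sampling. As a rough illustration of that general idea only, the sketch below computes a weighted mean in which each observation is weighted by the reciprocal of its sampling probability. It is a generic textbook-style estimator, not the authors' method; the function name `ipw_mean` and the toy numbers are assumptions made for this example.

```python
def ipw_mean(values, inclusion_probs):
    """Weighted mean with weights 1/p (hypothetical helper, illustration only).

    values: observed outcomes for the sampled units.
    inclusion_probs: assumed probability that each unit was sampled/reported.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Units that were unlikely to be observed stand in for more of the population.
    weights = [1.0 / p for p in inclusion_probs]
    weighted_total = sum(w * v for w, v in zip(weights, values))
    return weighted_total / sum(weights)


# Toy data: the two positive cases were over-sampled (p = 0.5),
# the two negative cases were under-sampled (p = 0.25).
observed = [1, 1, 0, 0]
probs = [0.5, 0.5, 0.25, 0.25]
naive = sum(observed) / len(observed)
corrected = ipw_mean(observed, probs)
```

In this toy case the unweighted mean is 0.5, while the weighted estimate is pulled toward the under-sampled group.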
GramSeq-DTA: A Grammar-Based Drug–Target Affinity Prediction Approach Fusing Gene Expression Information

https://doi.org/10.3390/biom15030405

Debnath, Kusal; Rana, Pratip; Ghosh, Preetam (March 2025, Biomolecules)

Drug–target affinity (DTA) prediction is a critical aspect of drug discovery. The meaningful representation of drugs and targets is crucial for accurate prediction. Using 1D string-based representations for drugs and targets is a common approach that has demonstrated good results in drug–target affinity prediction. However, this approach lacks information on the relative position of the atoms and bonds. To address this limitation, graph-based representations have been used to some extent. However, solely considering the structural aspect of drugs and targets may be insufficient for accurate DTA prediction. Integrating the functional aspect of these drugs at the genetic level can enhance the prediction capability of the models. To fill this gap, we propose GramSeq-DTA, which integrates chemical perturbation information with the structural information of drugs and targets. We applied a Grammar Variational Autoencoder (GVAE) for drug feature extraction and utilized two different approaches for protein feature extraction as follows: a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The chemical perturbation data are obtained from the L1000 project, which provides information on the up-regulation and down-regulation of genes caused by selected drugs. This chemical perturbation information is processed, and a compact dataset is prepared, serving as the functional feature set of the drugs. By integrating the drug, gene, and target features in the model, our approach outperforms the current state-of-the-art DTA prediction models when validated on widely used DTA datasets (BindingDB, Davis, and KIBA). This work provides a novel and practical approach to DTA prediction by merging the structural and functional aspects of biological entities, and it encourages further research in multi-modal DTA prediction.
Full Text Available
Heterogeneous Clustering of Multiomics Data for Breast Cancer Subgroup Classification and Detection

https://doi.org/10.3390/ijms26041707

Pateras, Joseph; Lodi, Musaddiq; Rana, Pratip; Ghosh, Preetam (February 2025, International Journal of Molecular Sciences)

The rapid growth of diverse -omics datasets has made multiomics data integration crucial in cancer research. This study adapts the expectation–maximization routine for the joint latent variable modeling of multiomics patient profiles. By combining this approach with traditional biological feature selection methods, this study optimizes latent distribution, enabling efficient patient clustering from well-studied cancer types with reduced computational expense. The proposed optimization subroutines enhance survival analysis and improve runtime performance. This article presents a framework for distinguishing cancer subtypes and identifying potential biomarkers for breast cancer. Key insights into individual subtype expression and function were obtained through differentially expressed gene analysis and pathway enrichment for BRCA patients. The analysis compared 302 tumor samples to 113 normal samples across 60,660 genes. The highly upregulated gene COL10A1, promoting breast cancer progression and poor prognosis, and the consistently downregulated gene CDG300LG, linked to brain metastatic cancer, were identified. Pathway enrichment analysis revealed similarities in cellular matrix organization pathways across subtypes, with notable differences in functions like cell proliferation regulation and endocytosis by host cells. GO Semantic Similarity analysis quantified gene relationships in each subtype, identifying potential biomarkers like MATN2, similar to COL10A1. These insights suggest deeper relationships within clusters and highlight personalized treatment potential based on subtypes.
Full Text Available
Improved KD-tree based imbalanced big data classification and oversampling for MapReduce platforms

https://doi.org/10.1007/s10489-024-05763-w

Sleeman, William C; Roseberry, Martha; Ghosh, Preetam; Cano, Alberto; Krawczyk, Bartosz (December 2024, Applied Intelligence)

Full Text Available
COFFEE: consensus single cell-type specific inference for gene regulatory networks

https://doi.org/10.1093/bib/bbae457

Lodi, Musaddiq K; Chernikov, Anna; Ghosh, Preetam (September 2024, Briefings in Bioinformatics)

Abstract The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence they have been inferred computationally for scRNA-seq data. A wisdom of crowds approach to integrate edges from several GRNs to create one composite GRN has demonstrated improved performance when compared with individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting-based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated, and experimental datasets when compared with baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus-based methods with pertinent modifications continue to be valuable for GRN inference at the single cell level. While COFFEE is benchmarked on 10 algorithms, it is a flexible strategy that can incorporate any set of GRN inference algorithms according to user preference. A Python implementation of COFFEE may be found on GitHub: https://github.com/lodimk2/coffee
Full Text Available
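The COFFEE record above describes a Borda voting-based consensus over ranked edges from several GRN inference methods. The sketch below shows a plain Borda count over ranked edge lists to make that idea concrete. It is not the COFFEE implementation (see the authors' repository for that); the function name `borda_consensus`, the scoring convention (top edge in a list of n gets n points), the alphabetical tie-break, and the gene labels are all assumptions for illustration.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Merge ranked edge lists with a simple Borda count (illustrative only).

    rankings: list of lists; each inner list holds (regulator, target) edges
    ordered from most to least confident by one inference method.
    Returns all edges sorted by total Borda score, highest first.
    """
    scores = defaultdict(int)
    for ranked_edges in rankings:
        n = len(ranked_edges)
        for position, edge in enumerate(ranked_edges):
            # Top-ranked edge earns n points, the last earns 1;
            # edges a method did not report earn nothing from it.
            scores[edge] += n - position
    # Break score ties alphabetically so the output is deterministic.
    return sorted(scores, key=lambda edge: (-scores[edge], edge))


# Three hypothetical methods ranking edges among genes G1..G3.
method_a = [("G1", "G2"), ("G2", "G3"), ("G1", "G3")]
method_b = [("G2", "G3"), ("G1", "G2"), ("G3", "G1")]
method_c = [("G1", "G2"), ("G1", "G3"), ("G2", "G3")]
consensus = borda_consensus([method_a, method_b, method_c])
```

Here the edge G1 to G2 ranks first in the consensus because two of the three hypothetical methods place it at the top and the third places it second.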
An effective drift-diffusion model for pandemic propagation and uncertainty prediction

https://doi.org/10.1016/j.bpr.2024.100182

Bender, Clara; Ghosh, Abhimanyu; Vakili, Hamed; Ghosh, Preetam; Ghosh, Avik W (December 2024, Biophysical Reports)

Full Text Available
A Survey on Data-Driven Approaches for Reliability, Robustness, and Energy Efficiency in Wireless Body Area Networks

https://doi.org/10.3390/s24206531

Majumdar, Pulak; Roy, Satyaki; Sikdar, Sudipta; Ghosh, Preetam; Ghosh, Nirnay (October 2024, Sensors)

Wireless Body Area Networks (WBANs) are pivotal in health care and wearable technologies, enabling seamless communication between miniature sensors and devices on or within the human body. These biosensors capture critical physiological parameters, ranging from body temperature and blood oxygen levels to real-time electrocardiogram readings. However, WBANs face significant challenges during and after deployment, including energy conservation, security, reliability, and failure vulnerability. Sensor nodes, which are often battery-operated, expend considerable energy during sensing and transmission due to inherent spatiotemporal patterns in biomedical data streams. This paper provides a comprehensive survey of data-driven approaches that address these challenges, focusing on device placement and routing, sampling rate calibration, and the application of machine learning (ML) and statistical learning techniques to enhance network performance. Additionally, we validate three existing models (statistical, ML, and coding-based models) using two real datasets, namely the MIMIC clinical database and biomarkers collected from six subjects with a prototype biosensing device developed by our team. Our findings offer insights into strategies for optimizing energy efficiency while ensuring security and reliability in WBANs. We conclude by outlining future directions to leverage approaches to meet the evolving demands of healthcare applications.
Full Text Available
